Abstract

Empirical Bayes point estimates of true score may be obtained if the distribution of observed score for a fixed examinee is approximated in one of several ways by a well-known compound binomial model. The Bayes estimates of true score may be expressed in terms of the observed score distribution and the distribution of a hypothetical binomial test. The latter distribution is found by use of the compound binomial approximation formula and from relationships which exist between Bayes estimates and unconditional probabilities of observed score. Empirical Bayes point estimates are obtained by use of the sample observed score distribution.
EMPIRICAL BAYES POINT ESTIMATES OF TRUE SCORE USING
A COMPOUND BINOMIAL ERROR MODEL¹,²

Jack Kearns

Educational Testing Service

Recently the empirical Bayes approach has been applied to certain problems in mental test theory. Empirical Bayes procedures are based upon the Bayes assumption of prior distributions, but utilize empirical information in lieu of making specific distributional assumptions. Several mental test theory applications have conceptualized the distribution of true scores as a prior distribution. These have included methods for the estimation of the true score distribution [Lord, 1969] and methods for obtaining point estimates of true score [Meredith, 1971; Meredith & Kearns, 1973]. Both of these approaches have used "strong" true score theory assumptions [cf. Lord & Novick, 1968, Part 6] which specify the form of the conditional distribution of observed scores for a fixed true score, i.e., the error or propensity distribution for an individual.

Point estimation methods have concentrated upon the development of estimates which are asymptotically optimal, i.e., estimates which approach a Bayes point estimator as the sample size becomes increasingly large. These estimates require essentially no a priori assumptions about the true score distribution (occasionally some very general assumptions are made). The Bayes point estimators are highly advantageous in that they minimize the overall expected squared error loss. This implies that the Bayes point estimates are as reliable as any other type of score [cf. Meredith & Kearns, 1973, Section VI]. The speed with which the empirical Bayes point estimator will approach the Bayes point estimator will depend upon various characteristics of the propensity and true score distributions, but, in general, the samples must be reasonably large to provide estimates which are superior to the maximum likelihood estimates (observed score). An alternative approach has involved various "smoothing" procedures which have been shown to reduce overall expected loss considerably for relatively small samples, and are generally superior to the use of observed scores, but are not asymptotically optimal.

¹ Presented at the 1974 Spring Meeting of the Psychometric Society, Stanford University, March 28-29, 1974.

² I am indebted to Frederic M. Lord for a critical reading of this paper and to Dorothy T. Thayer for implementing the necessary computer programs.

The point estimation problem has been explored for assumed Poisson and binomial error distributions. This study will extend this approach to the compound binomial error distribution used by Lord [1965, 1969], which is applicable to a fairly large class of nonspeeded, binary item tests. Let X_N be the random variable taking on particular values x which represent scores on an N item test. Let T be the proportion correct true score random variable with particular values τ (0 ≤ τ ≤ 1). If g(τ) is the distribution of true scores, and f(x|τ) is the error distribution, then the regression of T on X is given by



$$E(T \mid x) = \int_0^1 \tau\, h(\tau \mid x)\, d\tau = \frac{\int_0^1 \tau f(x \mid \tau)\, g(\tau)\, d\tau}{\int_0^1 f(x \mid \tau)\, g(\tau)\, d\tau} \tag{1}$$

where h(τ|x) is the "posterior" distribution of T, given x. If it is possible to calculate (1), this "posterior" mean is equivalent to the Bayes point estimator of T which minimizes the overall expected squared error loss [Maritz, 1970].
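As a concrete check on (1), the ratio of integrals can be evaluated numerically once an error distribution and a true score distribution are specified. The sketch below assumes a binomial error distribution and, purely for illustration, a Beta(2, 2) true score distribution; the prior, the midpoint rule, and all function names are assumptions that do not come from the paper. Under these assumptions the result should agree with the closed form (x + 2)/(N + 4).

```python
from math import comb

def binom_pmf(n, x, tau):
    """Binomial error distribution f(x | tau) for an n-item test."""
    return comb(n, x) * tau**x * (1.0 - tau)**(n - x)

def posterior_mean(x, n, prior_density, steps=20000):
    """Midpoint-rule evaluation of the ratio of integrals in (1)."""
    num = den = 0.0
    for j in range(steps):
        tau = (j + 0.5) / steps
        weight = binom_pmf(n, x, tau) * prior_density(tau)
        num += tau * weight
        den += weight
    return num / den

def beta22(tau):
    """Assumed prior g(tau): Beta(2, 2), with density 6 * tau * (1 - tau)."""
    return 6.0 * tau * (1.0 - tau)

# Bayes point estimates for every score on a hypothetical 10-item test.
estimates = [posterior_mean(x, 10, beta22) for x in range(11)]
```

In practice g(τ) is unknown, which is the reason the empirical Bayes methods of this paper work from the observed score distribution instead.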

If an individual's responses to the N items are experimentally independent, with a vector of probabilities of passing τ_g (g = 1, ..., N), then f(x|τ) is a compound binomial distribution, where τ = (1/N) Σ_g τ_g. This distribution depends upon the vector of the τ_g but may be expressed in terms of τ as a finite series [Lord, 1969]

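$$P_N(x \mid \tau) = p_N(x \mid \tau) + \frac{N}{2} V_2(\tau)\, C_2(x,\tau) + \frac{N}{3} V_3(\tau)\, C_3(x,\tau) + \left[\frac{N}{4} V_4(\tau) - \frac{N^2}{8} V_2^2(\tau)\right] C_4(x,\tau) + \left[\frac{N}{5} V_5(\tau) - \frac{N^2}{6} V_2(\tau) V_3(\tau)\right] C_5(x,\tau) + \cdots \quad (N \text{ terms}) \tag{2}$$

where

$$V_r(\tau) = \frac{1}{N} \sum_{g=1}^{N} (\tau_g - \tau)^r$$

and

$$C_r(x,\tau) = \sum_{v=0}^{r} (-1)^{v+1} \binom{r}{v}\, p_{N-r}(x - v \mid \tau),$$

where

$$p_n(x \mid \tau) = \begin{cases} \binom{n}{x} \tau^x (1 - \tau)^{n-x}, & x = 0, 1, \ldots, n, \\ 0, & \text{otherwise.} \end{cases}$$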

Lord has studied approximations which retain only the first few terms of (2) and approximate the V_r(τ) by functions of τ, e.g.,

$$V_2(\tau) \approx \frac{2k}{N}\, \tau (1 - \tau), \tag{3}$$

where k is a constant which must be estimated. We shall consider here the approximation obtained by retaining only the first two terms. If solely the first term is retained, (2) reduces to the binomial error model, which must be discussed first.
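The effect of truncating the series after two terms can be examined directly for a single examinee. The following sketch compares the exact compound binomial distribution, obtained by convolving the item distributions, with the two-term form p_N(x|τ) + (N/2) V_2(τ) C_2(x, τ). It uses the actual V_2(τ) rather than the approximation (3), and the item probabilities are hypothetical values chosen for illustration.

```python
from math import comb

def binom_pmf(n, x, tau):
    """p_n(x | tau); zero outside 0..n, as in the definition following (2)."""
    if x < 0 or x > n:
        return 0.0
    return comb(n, x) * tau**x * (1.0 - tau)**(n - x)

def compound_binomial_exact(item_probs):
    """Exact score distribution: convolve the N Bernoulli item distributions."""
    dist = [1.0]
    for p in item_probs:
        new = [0.0] * (len(dist) + 1)
        for score, prob in enumerate(dist):
            new[score] += prob * (1.0 - p)      # item failed
            new[score + 1] += prob * p          # item passed
        dist = new
    return dist

def two_term_series(item_probs):
    """First two terms of the series: p_N(x|tau) + (N/2) V_2 C_2(x, tau)."""
    n = len(item_probs)
    tau = sum(item_probs) / n
    v2 = sum((p - tau) ** 2 for p in item_probs) / n
    out = []
    for x in range(n + 1):
        c2 = sum((-1) ** (v + 1) * comb(2, v) * binom_pmf(n - 2, x - v, tau)
                 for v in range(3))
        out.append(binom_pmf(n, x, tau) + 0.5 * n * v2 * c2)
    return out

item_probs = [0.5, 0.55, 0.6, 0.65, 0.7]   # hypothetical passing probabilities
exact = compound_binomial_exact(item_probs)
approx = two_term_series(item_probs)
```

For a two-item test the two-term form reproduces the exact distribution; for longer tests the discrepancy depends on the higher moments V_r(τ) of the item probabilities.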

THE BINOMIAL ERROR MODEL

When f(x|τ) is binomial (N, τ) [cf. Robbins, 1955], equation (1) becomes

$$e_N(T \mid x) = \frac{x + 1}{N + 1}\, \frac{p_{N+1}(x + 1)}{p_N(x)}, \tag{4}$$

where the values of p_N(x) are the probabilities of the unconditional distribution of observed scores X_N. With information from only N items, this regression is indeterminate [Lord & Novick, 1968] since the distribution represented by the values of p_{N+1}(x + 1) is unknown. Bayes point estimates of T are obtained by considering outcomes for only the first N - 1 items, i.e.,

$$e_{N-1}(T \mid x) = \frac{x + 1}{N}\, \frac{p_N(x + 1)}{p_{N-1}(x)}. \tag{5}$$

If the items are truly equivalent, then any item may be deleted to obtain such an estimate. The substitution of sample proportions in (5) for the values of p_N(x + 1) and p_{N-1}(x) gives one an asymptotically optimal empirical Bayes estimator.
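The estimator of (5), e_{N-1}(T|x) = [(x + 1)/N] p_N(x + 1)/p_{N-1}(x), can be illustrated with population probabilities before any sampling is introduced. The sketch below assumes a Beta(3, 2) true score distribution, so that the unconditional distributions are beta-binomial; the prior, the test length, and the function names are illustrative assumptions. With this prior the estimates should equal (a + x)/(a + b + N - 1), and in an application sample proportions would take the place of the exact probabilities.

```python
from math import comb, exp, lgamma

def log_beta(a, b):
    """Logarithm of the beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)

def beta_binomial_pmf(n, x, a, b):
    """Unconditional p_n(x) for binomial errors and an assumed Beta(a, b) prior."""
    return comb(n, x) * exp(log_beta(a + x, b + n - x) - log_beta(a, b))

def bayes_estimate_n_minus_1(x, n, p_n, p_n_minus_1):
    """Bayes estimate from (5): (x + 1)/N * p_N(x + 1) / p_{N-1}(x)."""
    return (x + 1) / n * p_n[x + 1] / p_n_minus_1[x]

N, a, b = 11, 3.0, 2.0   # hypothetical test length and prior parameters
p_N = [beta_binomial_pmf(N, x, a, b) for x in range(N + 1)]
p_Nm1 = [beta_binomial_pmf(N - 1, x, a, b) for x in range(N)]
e_Nm1 = [bayes_estimate_n_minus_1(x, N, p_N, p_Nm1) for x in range(N)]
```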

Meredith and Kearns [1973] have developed a procedure for assigning Bayes point estimates to individuals who obtain a given score on the N item test. This depends upon the following result for the binomial distribution [cf. Lord & Novick, 1968, p. 365, corollary]:

$$\Pr(X_{N-1} = x \mid X_N = x) = \frac{N - x}{N}, \qquad \Pr(X_{N-1} = x - 1 \mid X_N = x) = \frac{x}{N}. \tag{6}$$

This conditional distribution is independent of the parameter τ, and hence is valid for the entire population of individuals. If we set X_N = x but treat X_{N-1} as a random variable which can assume only the two values x and x - 1, then the expected value of e_{N-1}(T|X_{N-1}) conditional upon X_N = x is

$$E\left[e_{N-1}(T \mid X_{N-1}) \mid X_N = x\right] = \frac{N - x}{N}\, e_{N-1}(T \mid x) + \frac{x}{N}\, e_{N-1}(T \mid x - 1). \tag{7}$$

This is a population estimate which may be assigned to an individual on the basis of his score on the entire N-item test. This estimate is based upon information from only N items. It is not to be identified with the regression estimate of (4), which requires information from N + 1 items in order to be determined. The estimate of (7) contains no more information than the regression estimates of (5). It represents a probabilistic assignment, based upon the distribution of (6), to one of two possible outcomes on an N - 1 item test obtained by deleting an item. Meredith and Kearns [1973] used this result to obtain the estimate
where the superscript (i) indicates that item i has been deleted. This equation combines results from all possible item deleted subtests and reduces the overall expected loss when the e(T|X_{N-1} = x) are estimated empirically. This assignment procedure may be extended to outcomes on an N - 2 item test, i.e.,

(8) e,,,(vi(^)lxi = WVi'^'Vi '"'^^i'- 

Again, this estimate contains no more information than (7) or (5).
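The assignment of (7) is a weighted average of two (N - 1)-item Bayes estimates, with weight (N - x)/N on the outcome X_{N-1} = x and weight x/N on the outcome X_{N-1} = x - 1. A minimal sketch follows; the function name is hypothetical, and the stand-in values of e_{N-1}(T|y) are those implied by an assumed Beta(3, 2) true score distribution with N = 11.

```python
def assigned_estimate(x, n, e_n_minus_1):
    """Average e_{N-1}(T|.) over the two possible (N-1)-item scores, as in (7)."""
    total = 0.0
    if x < n:
        # The deleted item was failed: X_{N-1} = x, with weight (N - x)/N.
        total += (n - x) / n * e_n_minus_1[x]
    if x > 0:
        # The deleted item was passed: X_{N-1} = x - 1, with weight x/N.
        total += x / n * e_n_minus_1[x - 1]
    return total

N = 11
# Stand-in values of e_{N-1}(T|y), y = 0..N-1, under the assumed Beta(3, 2) prior.
e_Nm1 = [(3 + y) / 15 for y in range(N)]
assigned = [assigned_estimate(x, N, e_Nm1) for x in range(N + 1)]
```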

With increasing N, there is a corresponding increase in the number of moments (N) of the true score distribution which may be obtained from the observed score distribution [Lord & Novick, 1968, p. 521].



The true score distribution can never be uniquely determined, but, with increasing N, the class of possible true score distributions becomes increasingly restricted. In addition, the relative amount of information sacrificed by using the estimate of (5) decreases. In this context, the estimate of (7) may be seen as appropriate for X_N = x, but based upon an inadequate amount of information about the true score distribution. The amount of information needed for the regression in (4) is equivalent to that obtained with N + 1 items. Since the assignment procedure of (7) may be extended to Bayes estimates obtained with fewer items (as in (8)), it becomes possible to observe the way in which, for fixed X_N = x, these assigned estimates change as a function of the increasing number of items N* on which they are based. If this functional relation is sufficiently regular, it should be possible to fit a curve for several values of N* and extrapolate to N + 1. This extrapolated estimate should be a close approximation to the nonidentifiable estimate of (4).
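The curve fitting step can be sketched as a least-squares quadratic in the number of items, evaluated at N + 1. The choice of a quadratic through four points follows the description given later for the applications; the data values below are hypothetical, and the small Gaussian elimination routine is included only to keep the sketch self-contained.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def quadratic_extrapolate(test_lengths, estimates, target):
    """Fit y = c0 + c1*u + c2*u^2 (u = centered length) and evaluate at target."""
    center = sum(test_lengths) / len(test_lengths)
    u = [t - center for t in test_lengths]
    power_sums = [sum(ui ** p for ui in u) for p in range(5)]
    normal = [[power_sums[i + j] for j in range(3)] for i in range(3)]
    rhs = [sum(y * ui ** i for ui, y in zip(u, estimates)) for i in range(3)]
    c0, c1, c2 = solve(normal, rhs)
    d = target - center
    return c0 + c1 * d + c2 * d * d

# Hypothetical assigned estimates for N* = 7, 8, 9, 10 on an N = 11 item test.
lengths = [7, 8, 9, 10]
values = [0.5 + 0.01 * n - 0.0002 * n * n for n in lengths]
extrapolated = quadratic_extrapolate(lengths, values, 12)
```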



THE COMPOUND BINOMIAL ERROR MODEL



The two term approximation of (2) using the additional approximation of (3) may be written, in terms of τ, as

$$P_N(x \mid \tau) = p_N(x \mid \tau) + k \tau (1 - \tau) \sum_{v=0}^{2} (-1)^{v+1} \binom{2}{v}\, p_{N-2}(x - v \mid \tau). \tag{9}$$

Lord [1965, 1969] suggests obtaining k such that the correlation between true and observed scores is equal to the square root of the Kuder-Richardson formula-20 reliability. This value of k is
$$k = \frac{N^2 (N - 1)\, \sigma_\pi^2}{2\left[\mu_x (N - \mu_x) - \sigma_x^2 - N \sigma_\pi^2\right]}, \tag{10}$$

where μ_x and σ_x² are the mean and variance of X_N, and σ_π² is the variance of the item difficulties. The unconditional distribution of X_N is, using (9),

$$P_N(x) = \int_0^1 P_N(x \mid \tau)\, g(\tau)\, d\tau = p_N(x) + k \sum_{v=0}^{2} (-1)^{v+1} \binom{2}{v} \frac{(x - v + 1)(N - x + v - 1)}{N (N - 1)}\, p_N(x - v + 1). \tag{11}$$
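The constant k that enters (9) and (11) is a simple function of the observed score mean and variance and of the variance of the item difficulties. The sketch below evaluates k = N²(N - 1)σ_π² / (2[μ_x(N - μ_x) - σ_x² - Nσ_π²]); the function name and the numerical inputs are hypothetical and are not the values from the paper's examples.

```python
def lord_k(n_items, mean_x, var_x, var_difficulty):
    """Constant k from the score mean, score variance and item-difficulty variance."""
    numerator = n_items ** 2 * (n_items - 1) * var_difficulty
    denominator = 2.0 * (mean_x * (n_items - mean_x) - var_x
                         - n_items * var_difficulty)
    return numerator / denominator

# Hypothetical 11-item test: mean 6.0, variance 5.0, difficulty variance 0.001.
k = lord_k(11, 6.0, 5.0, 0.001)
```

When all items are equally difficult, σ_π² = 0 and k = 0, so that (9) reduces to the binomial error model.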



From the definition following (2), p_{N-2}(x - v|τ) = 0 unless 0 ≤ x - v ≤ N - 2. Consequently, in equation (11), the term involving p_N(x - v + 1) is equal to zero under the same conditions, i.e., when x < v or x > N - 2 + v. With the value of k obtained from (10), equation (11) represents N + 1 equations in the N + 1 unknowns, p_N(x).
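Because (11) is linear in the unknown p_N(x), the hypothetical binomial distribution can be recovered from the observed distribution by solving a banded linear system. The sketch below builds the coefficient matrix by applying the right-hand side of (11) to unit vectors and solves it by Gaussian elimination; the distribution, the value of k, and the function names are hypothetical, and a library solver could be substituted.

```python
from math import comb

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def observed_from_binomial(p, k):
    """Right-hand side of (11): P_N(x) implied by hypothetical binomial p_N(x)."""
    n = len(p) - 1
    out = []
    for x in range(n + 1):
        total = p[x]
        for v in range(3):
            j = x - v + 1
            if 0 <= j <= n:
                coef = (x - v + 1) * (n - x + v - 1) / (n * (n - 1))
                total += k * (-1) ** (v + 1) * comb(2, v) * coef * p[j]
        out.append(total)
    return out

def binomial_from_observed(big_p, k):
    """Solve the N + 1 linear equations of (11) for the unknown p_N(x)."""
    n = len(big_p) - 1
    cols = [observed_from_binomial([1.0 if i == j else 0.0 for i in range(n + 1)], k)
            for j in range(n + 1)]
    matrix = [[cols[j][x] for j in range(n + 1)] for x in range(n + 1)]
    return solve(matrix, big_p)

raw = [1, 3, 6, 9, 8, 5, 2]                 # hypothetical frequencies, N = 6
p_hyp = [r / sum(raw) for r in raw]
P_obs = observed_from_binomial(p_hyp, 0.08)
p_back = binomial_from_observed(P_obs, 0.08)
```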

Let

E_N(T|x) = E_N(T|X_N = x)

for the compound binomial test, and

e_N(T|x) = e_N(T|X_N = x)

for the hypothetical binomial error test whose distribution is represented by the values of p_N(x). Then the Bayes point estimate of T, obtained by substitution of (9) into (1), is



* 
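$$E_N[T \mid x] = \frac{\int_0^1 \tau P_N(x \mid \tau)\, g(\tau)\, d\tau}{\int_0^1 P_N(x \mid \tau)\, g(\tau)\, d\tau} = \frac{\dfrac{x + 1}{N + 1}\, p_{N+1}(x + 1) + k \displaystyle\sum_{v=0}^{2} (-1)^{v+1} \binom{2}{v} \frac{(x - v + 2)(x - v + 1)(N - x + v - 1)}{(N + 1) N (N - 1)}\, p_{N+1}(x - v + 2)}{p_N(x) + k \displaystyle\sum_{v=0}^{2} (-1)^{v+1} \binom{2}{v} \frac{(x - v + 1)(N - x + v - 1)}{N (N - 1)}\, p_N(x - v + 1)}. \tag{12}$$

Dividing through by p_N(x) and letting

$$A(x) = 1 + 2k\, \frac{x (N - x)}{N (N - 1)},$$

$$B(x) = k\, \frac{(x + 1)(N - x - 1)}{N (N - 1)}\, \frac{p_N(x + 1)}{p_N(x)},$$

$$C(x) = k\, \frac{(x - 1)(N - x + 1)}{N (N - 1)}\, \frac{p_N(x - 1)}{p_N(x)},$$

we may write (12) as

$$E_N[T \mid x] = \frac{A(x)\, e_N(T \mid x) - B(x)\, e_N(T \mid x + 1) - C(x)\, e_N(T \mid x - 1)}{A(x) - B(x) - C(x)}. \tag{13}$$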



The values of p_N(x) may be obtained by solution of the set of N + 1 equations given by (11).
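Once the p_N(x) and the binomial estimates e_N(T|x) are in hand, the compound binomial estimate is a ratio of three-term combinations. The sketch below evaluates [A e_N(T|x) - B e_N(T|x + 1) - C e_N(T|x - 1)] / [A - B - C] with A = 1 + 2k x(N - x)/[N(N - 1)], B = k (x + 1)(N - x - 1)/[N(N - 1)] times p_N(x + 1)/p_N(x), and C = k (x - 1)(N - x + 1)/[N(N - 1)] times p_N(x - 1)/p_N(x). The Beta(3, 2) true score distribution, the value of k, and the function names are assumptions used only to supply inputs.

```python
from math import comb, exp, lgamma

def compound_binomial_estimate(x, n, k, p, e):
    """E_N[T|x] from hypothetical binomial p_N(.) and e_N(T|.), as in (13)."""
    scale = k / (n * (n - 1))
    a_x = 1.0 + 2.0 * scale * x * (n - x)
    num = a_x * e[x]
    den = a_x
    if x < n:   # B(x) vanishes at x = N because p_N(N + 1) = 0
        b_x = scale * (x + 1) * (n - x - 1) * p[x + 1] / p[x]
        num -= b_x * e[x + 1]
        den -= b_x
    if x > 0:   # C(x) vanishes at x = 0 because p_N(-1) = 0
        c_x = scale * (x - 1) * (n - x + 1) * p[x - 1] / p[x]
        num -= c_x * e[x - 1]
        den -= c_x
    return num / den

def log_beta(a, b):
    """Logarithm of the beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)

# Inputs under an assumed Beta(3, 2) true score distribution, N = 6, k = 0.1.
N, K, A0, B0 = 6, 0.1, 3.0, 2.0
p_N = [comb(N, x) * exp(log_beta(A0 + x, B0 + N - x) - log_beta(A0, B0))
       for x in range(N + 1)]
e_N = [(A0 + x) / (A0 + B0 + N) for x in range(N + 1)]
E_N = [compound_binomial_estimate(x, N, K, p_N, e_N) for x in range(N + 1)]
```

With k = 0 the function returns e_N(T|x) unchanged, which corresponds to the binomial error model.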

The procedures outlined in the previous section may be used to estimate e_N(T|x), with the following necessary modification. The probabilities p_{N-1}(x) used in (5) can be obtained, in the case of a binomial error distribution, by deleting any item and observing the distribution. Alternatively equation (6) may be used to write

$$p_{N-1}(x) = \frac{N - x}{N}\, p_N(x) + \frac{x + 1}{N}\, p_N(x + 1). \tag{14}$$

This equation may be used in the case of the hypothetical values of p_N(x) determined by (11). The extrapolated estimates of e_N(T|x) may be found by the indicated procedure and substituted into (13).
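The step from p_N(x) to the distribution for a test one item shorter can be sketched as follows, assuming the relation takes the form implied by (6), namely p_{N-1}(x) = [(N - x)/N] p_N(x) + [(x + 1)/N] p_N(x + 1). The function name and the example distribution are illustrative assumptions; repeated application gives the distributions for N - 2, N - 3, ... items needed for the extrapolation.

```python
from math import comb

def delete_one_item(p_n):
    """p_{N-1}(x) = (N - x)/N * p_N(x) + (x + 1)/N * p_N(x + 1), x = 0..N-1."""
    n = len(p_n) - 1
    return [(n - x) / n * p_n[x] + (x + 1) / n * p_n[x + 1] for x in range(n)]

# Example: a binomial (4, 0.3) score distribution for a single fixed true score.
p_4 = [comb(4, x) * 0.3**x * 0.7**(4 - x) for x in range(5)]
p_3 = delete_one_item(p_4)
p_2 = delete_one_item(p_3)
```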

If, as is the usual case, we have only sample estimates of the P_N(x), the estimation procedure must be examined with regard to sampling variability and its effect upon the overall expected squared-error loss, which is a random variable over repeated sampling [cf. Maritz, 1970]. This is the empirical Bayes situation which we shall consider next.

EMPIRICAL BAYES ESTIMATES

Simple empirical Bayes estimates may be obtained for the regression of (13) by substitution of sample proportions, P̂_N(x), for the P_N(x) of equation (11). However, these are not likely to be the best estimates in terms of minimizing the overall expected squared error loss [cf. Maritz, 1970, pp. 17-18] unless sufficiently large samples are used.
Several empirical Bayes estimators have been proposed which can minimize this loss for small or moderate samples and are applicable to the binomial error model [Lemon & Krutchkoff, 1969; Griffin & Krutchkoff, 1971; Copas, 1972; Bennett & Martz, 1972; Meredith & Kearns, 1973]. However, while these procedures generally have desirable small-sample properties, they are not usually the best procedures for extremely large samples, i.e., they are not asymptotically optimal. However, two of these methods [Lemon & Krutchkoff, 1969; Bennett & Martz, 1972] approach asymptotic optimality as N → ∞, and are approximations of (4) rather than (5). The advantage of these two smoothing procedures is therefore a function of the size of the sample relative to the value of N. A particular method may be advantageous depending upon the particular characteristics of the distribution of scores. All of these procedures may be used to estimate e_N(T|x) from the p_N(x) obtained from (11).

An additional empirical consideration is the stability of the ratios [p_N(x + 1)/p_N(x)] and [p_N(x - 1)/p_N(x)] needed for B(x) and C(x). If these ratios are estimated by substitution of the values of p_N(x), they are likely to increase sampling error unless the sample is quite large. Lord [1959] has shown that the recurrence relation



$$e_N(T \mid x) = 1 - \frac{N - x + 1}{x}\, \frac{p_N(x - 1)}{p_N(x)}\, e_N(T \mid x - 1) \tag{15}$$

holds for x = 1, ..., N in the population. This represents N equations in N + 1 unknowns and reflects the indeterminacy discussed in terms of equation (4). For sufficiently large samples, we should require that the estimates of e_N(T|x) satisfy the constraints represented by (15). Alternatively, equation (15) provides a method for estimating B(x) and C(x) if estimates of e_N(T|x) are available, i.e.,



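$$\frac{p_N(x + 1)}{p_N(x)} = \frac{N - x}{x + 1}\, \frac{e_N(T \mid x)}{1 - e_N(T \mid x + 1)} \tag{16}$$

and

$$\frac{p_N(x - 1)}{p_N(x)} = \frac{x}{N - x + 1}\, \frac{1 - e_N(T \mid x)}{e_N(T \mid x - 1)}. \tag{17}$$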
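The two ratios needed for B(x) and C(x) follow from the recurrence by rearrangement, and the rearrangement can be checked numerically. The sketch below assumes the forms p_N(x + 1)/p_N(x) = [(N - x)/(x + 1)] e_N(T|x)/[1 - e_N(T|x + 1)] and p_N(x - 1)/p_N(x) = [x/(N - x + 1)] [1 - e_N(T|x)]/e_N(T|x - 1), and compares them with exact ratios under an assumed Beta(3, 2) true score distribution; the prior and the function names are illustrative.

```python
from math import comb, exp, lgamma

def ratio_next(x, n, e):
    """p_N(x + 1) / p_N(x) expressed through the estimates e_N(T|.)."""
    return (n - x) / (x + 1) * e[x] / (1.0 - e[x + 1])

def ratio_prev(x, n, e):
    """p_N(x - 1) / p_N(x) expressed through the estimates e_N(T|.)."""
    return x / (n - x + 1) * (1.0 - e[x]) / e[x - 1]

def log_beta(a, b):
    """Logarithm of the beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)

# Population quantities under the assumed Beta(3, 2) prior with N = 8.
N, a, b = 8, 3.0, 2.0
p_N = [comb(N, x) * exp(log_beta(a + x, b + N - x) - log_beta(a, b))
       for x in range(N + 1)]
e_N = [(a + x) / (a + b + N) for x in range(N + 1)]
next_ratios = [ratio_next(x, N, e_N) for x in range(N)]
prev_ratios = [ratio_prev(x, N, e_N) for x in range(1, N + 1)]
```

Smoothed estimates of e_N(T|x) can be passed in place of the population values, which is the use made of these ratios in the applications.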



APPLICATIONS

An example is taken from Meredith and Kearns [1973]. The data represent a sample of 7718 respondents on an 11-item subtest selected from items of the School and College Ability Tests. The items were selected to have approximately equal item difficulties. The computed value of k was .000859.

Table 1 gives results for the binomial error model assumption. The assigned estimates corresponding to (7) are shown along with extensions (as in (8)) based upon information from N* items (N* < N). In addition, the extrapolated estimates of the regression function are given. The extrapolated estimates for this and the following example were obtained by fitting a quadratic curve to the "last" four points, i.e., for N* equal to N - 1, N - 2, N - 3, and N - 4. Table 2 shows the results for the compound binomial error model using the same data. The assigned and extrapolated estimates for the hypothetical binomial error distribution obtained from (11) are shown. These values are also presented in Figure 1. The empirical Bayes estimates corresponding to (13) are shown for both (a) substitution of the values of p_N(x) and (b) use of (16) and (17) to estimate B(x) and C(x). Since the items are so similar in difficulty, it is not surprising that these estimates correspond very closely to those in Table 1.

Figure 1. Compound Binomial Model. Data from Meredith and Kearns [1973].

A second example from Lord [1965] uses one of the sixteen distributions analyzed in that study (N = 25, sample = 1000). Table 3 gives the results of applying both the compound binomial and binomial models. For all estimates there is a general lack of monotonicity which reflects the smaller sample size (and larger number of items). In addition, the estimates appear quite erratic where the frequencies are small (near x = 0). This suggests that a smoothing procedure should be used. Note that the results of the binomial and compound binomial are similar although both are jagged.

The smoothing procedure of Lemon and Krutchkoff [1969] was applied using the P̂_N(x) of Table 3. This procedure essentially obtains estimates by smoothing the p̂_N(x). The estimates appropriate for the binomial distribution are

$$e^*(T \mid x) = \frac{\sum_{i=1}^{n} \hat{T}_i^{\,x+1} \left[1 - \hat{T}_i\right]^{N-x}}{\sum_{i=1}^{n} \hat{T}_i^{\,x} \left[1 - \hat{T}_i\right]^{N-x}},$$

where i = 1, ..., n refers to a summation over the sample and T̂_i is some estimate of true score for individual i. Following Lemon and Krutchkoff, the initial estimate of T_i is the observed proportion x_i/N, and an iterated smoothed estimate is obtained by setting T̂_i equal to the initial smoothed estimate, e*(T|x_i). The iterated smoothed estimates are shown in Table 4 along with the corresponding compound binomial estimates using (16) and (17). These estimates appear quite smooth and exhibit monotonicity throughout the range of x.
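The smoothing step can be sketched for the binomial case as a ratio of two sums over the sample, e*(T|x) = Σ_i T̂_i^(x+1) (1 - T̂_i)^(N-x) / Σ_i T̂_i^x (1 - T̂_i)^(N-x), applied twice. The use of the observed proportion x_i/N as the initial estimate is an assumption of this sketch, as are the function names and the small sample of scores; the sample must contain some variation in scores for every denominator to be positive.

```python
def smoothed_estimates(n_items, t_hat):
    """Smoothed estimate e*(T|x) for x = 0..N from individual estimates t_hat."""
    out = []
    for x in range(n_items + 1):
        weights = [t ** x * (1.0 - t) ** (n_items - x) for t in t_hat]
        out.append(sum(w * t for w, t in zip(weights, t_hat)) / sum(weights))
    return out

def iterated_smoothing(n_items, scores):
    """Initial estimates x_i/N (assumed), followed by one further smoothing pass."""
    first = smoothed_estimates(n_items, [x / n_items for x in scores])
    return smoothed_estimates(n_items, [first[x] for x in scores])

# Hypothetical sample of observed scores on a 10-item test.
scores = [2, 3, 3, 4, 5, 5, 6, 7, 8, 8, 9, 10]
smoothed = iterated_smoothing(10, scores)
```

Because each smoothed value is a weighted mean of the T̂_i with weights that shift toward larger T̂_i as x increases, the resulting estimates are nondecreasing in x.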
Table 4
Smoothed Estimates

e*(T|x)
.218
.255
.275
.29k
.515
.558
.562
.586
.kl6
M6
1508
.559
.571
.602
.6^k
.666
.698
.728
.757
.785
.807
.852
.^60
.891

E*(T|x)
.192
.2i5
.256
.258
.281
.505
.527
.551
.578
.^k
>557
.569
.602
.655
.669
.702
.765
.790
.815
.8lil
.870
.902